
14 research outputs found

Self-Paced Multitask Learning with Shared Knowledge

Author: Carbonell Jaime
Murugesan Keerthiram
Publication date: 19/06/2017

This paper introduces self-paced task selection to multitask learning, where instances from more closely related tasks are selected in a progression of easier-to-harder tasks, to emulate an effective human education strategy, but applied to multitask machine learning. We develop the mathematical foundation for the approach based on iterative selection of the most appropriate task, learning the task parameters, and updating the shared knowledge, optimizing a new bi-convex loss function. The proposed method applies quite generally, including to multitask feature learning, multitask learning with alternating structure optimization, etc. Results show that in each of the above formulations self-paced (easier-to-harder) task selection outperforms the baseline version of these methods in all the experiments.
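The easier-to-harder progression described in this abstract can be illustrated with the generic hard-threshold rule from self-paced learning: a task is selected once its current loss falls below a threshold that grows each round. This is only a minimal sketch under stated assumptions, not the paper's bi-convex formulation; the function names, the geometric threshold schedule, and the fixed toy losses are all hypothetical.

```python
def select_tasks(task_losses, threshold):
    """Return indices of tasks whose current loss is below the threshold."""
    return [t for t, loss in enumerate(task_losses) if loss < threshold]


def self_paced_schedule(task_losses, start, growth, rounds):
    """Return the selected task set for each round as the threshold grows.

    Assumption: losses are held fixed for illustration. In a real
    self-paced loop the task parameters and shared knowledge would be
    re-fit after each selection, which changes the losses.
    """
    threshold = start
    schedule = []
    for _ in range(rounds):
        schedule.append(select_tasks(task_losses, threshold))
        threshold *= growth  # admit harder tasks in later rounds
    return schedule


# Toy per-task losses (hypothetical): task 0 is easiest, task 3 hardest.
losses = [0.2, 1.5, 0.7, 3.0]
schedule = self_paced_schedule(losses, start=0.5, growth=2.0, rounds=4)
for round_index, selected in enumerate(schedule):
    print(round_index, selected)
```

With these toy values the selected set grows from the single easiest task to all four tasks as the threshold doubles.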

arXiv.org e-Print Archive

Crossref

CLUSTER-BASED TERM WEIGHTING AND DOCUMENT RANKING MODELS

Author: Murugesan Keerthiram
Publication venue: UKnowledge
Publication date: 01/01/2011

A term weighting scheme measures the importance of a term in a collection. A document ranking model uses these term weights to find the rank or score of a document in a collection. We present a series of cluster-based term weighting and document ranking models based on the TF-IDF and Okapi BM25 models. These term weighting and document ranking models update the inter-cluster and intra-cluster frequency components based on the generated clusters. These inter-cluster and intra-cluster frequency components are used for weighting the importance of a term in addition to the term and document frequency components. In this thesis, we show how these models outperform the TF-IDF and Okapi BM25 models in document clustering and ranking.
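For orientation, the two baseline models named in this abstract can be sketched as follows. This is a minimal illustration of standard TF-IDF and Okapi BM25 only; the abstract does not give the cluster-based inter-cluster and intra-cluster components, so they are not reproduced here. The function names, the toy corpus, the defaults k1 = 1.5 and b = 0.75, and the non-negative IDF variant (with +1 inside the logarithm) are assumptions made for the example.

```python
import math
from collections import Counter


def tfidf_weight(term, doc, corpus):
    """TF-IDF weight of a term in one tokenized document."""
    df = sum(1 for d in corpus if term in d)  # document frequency
    if df == 0:
        return 0.0
    tf = doc.count(term) / len(doc)  # length-normalized term frequency
    return tf * math.log(len(corpus) / df)


def bm25_idf(n_docs, df):
    """Non-negative BM25 IDF variant (assumed; other variants exist)."""
    return math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)


def bm25_score(query, doc, corpus, k1=1.5, b=0.75):
    """Okapi BM25 score of a tokenized document for a tokenized query."""
    n_docs = len(corpus)
    avg_len = sum(len(d) for d in corpus) / n_docs
    tf = Counter(doc)
    score = 0.0
    for term in query:
        df = sum(1 for d in corpus if term in d)
        if df == 0:
            continue  # term unseen in the collection contributes nothing
        f = tf[term]
        # Saturating term frequency with document-length normalization.
        norm = f + k1 * (1 - b + b * len(doc) / avg_len)
        score += bm25_idf(n_docs, df) * f * (k1 + 1) / norm
    return score


# Toy tokenized corpus (hypothetical).
corpus = [["cat", "sat"], ["dog", "sat"], ["cat", "cat", "dog"]]
ranked = sorted(range(len(corpus)),
                key=lambda i: bm25_score(["cat"], corpus[i], corpus),
                reverse=True)
print(ranked)
```

A cluster-based variant would add further frequency components to these weights; how they are combined is specific to the thesis and is not assumed here.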

University of Kentucky

Eye of the Beholder: Improved Relation Generalization for Text-based Reinforcement Learning Agents

Author: Chaudhury Subhajit
Murugesan Keerthiram
Talamadupula Kartik
Publication date: 15/06/2021

Text-based games (TBGs) have become a popular proving ground for the demonstration of learning-based agents that make decisions in quasi real-world settings. The crux of the problem for a reinforcement learning agent in such TBGs is identifying the objects in the world, and those objects' relations with that world. While the recent use of text-based resources for increasing an agent's knowledge and improving its generalization has shown promise, we posit in this paper that there is much yet to be learned from visual representations of these same worlds. Specifically, we propose to retrieve images that represent specific instances of text observations from the world and train our agents on such images. This improves the agent's overall understanding of the game 'scene' and objects' relationships to the world around them, and the variety of visual representations on offer allows the agent to generate a better generalization of a relationship. We show that incorporating such images improves the performance of agents in various TBG settings.

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Targeted Advertising on Social Networks Using Online Variational Tensor Regression

Author: Abe Naoki
Bouneffouf Djallel
Idé Tsuyoshi
Murugesan Keerthiram
Publication date: 09/10/2022

This paper is concerned with online targeted advertising on social networks. The main technical task we address is to estimate the activation probability for user pairs, which quantifies the influence one user may have on another towards purchasing decisions. This is a challenging task because one marketing episode typically involves a multitude of marketing campaigns/strategies of different products for highly diverse customers. In this paper, we propose what we believe is the first tensor-based contextual bandit framework for online targeted advertising. The proposed framework is designed to accommodate any number of feature vectors in the form of a multi-mode tensor, thereby making it possible to capture the heterogeneity that may exist over user preferences, products, and campaign strategies in a unified manner. To handle inter-dependency of tensor modes, we introduce an online variational algorithm with a mean-field approximation. We empirically confirm that the proposed TensorUCB algorithm achieves a significant improvement in influence maximization tasks over the benchmarks, which is attributable to its capability of capturing the user-product heterogeneity. Comment: 18 pages, 7 figures

arXiv.org e-Print Archive

Text-based RL Agents with Commonsense Knowledge: New Challenges, Environments and Baselines

Author: Atzeni Mattia
Campbell Murray
Kapanipathi Pavan
Kumaravel Sadhana
Murugesan Keerthiram
Sachan Mrinmaya
Shukla Pushkar
Talamadupula Kartik
Tesauro Gerald
Publication date: 08/10/2020

Text-based games have emerged as an important test-bed for Reinforcement Learning (RL) research, requiring RL agents to combine grounded language understanding with sequential decision making. In this paper, we examine the problem of infusing RL agents with commonsense knowledge. Such knowledge would allow agents to efficiently act in the world by pruning out implausible actions, and to perform look-ahead planning to determine how current actions might affect future world states. We design a new text-based gaming environment called TextWorld Commonsense (TWC) for training and evaluating RL agents with a specific kind of commonsense knowledge about objects, their attributes, and affordances. We also introduce several baseline RL agents which track the sequential context and dynamically retrieve the relevant commonsense knowledge from ConceptNet. We show that agents which incorporate commonsense knowledge in TWC perform better, while acting more efficiently. We conduct user studies to estimate human performance on TWC and show that there is ample room for future improvement.

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

On the Convergence and Sample Complexity Analysis of Deep Q-Networks with $\epsilon$-Greedy Exploration

Author: Chaudhury Subhajit
Chen Pin-Yu
Li Hongkang
Liu Miao
Liu Sijia
Lu Songtao
Murugesan Keerthiram
Wang Meng
Zhang Shuai
Publication date: 24/10/2023

This paper provides a theoretical understanding of Deep Q-Network (DQN) with the $\epsilon$-greedy exploration in deep reinforcement learning. Despite the tremendous empirical achievement of the DQN, its theoretical characterization remains underexplored. First, the exploration strategy is either impractical or ignored in the existing analysis. Second, in contrast to conventional Q-learning algorithms, the DQN employs the target network and experience replay to acquire an unbiased estimation of the mean-square Bellman error (MSBE) utilized in training the Q-network. However, the existing theoretical analysis of DQNs lacks convergence analysis or bypasses the technical challenges by deploying a significantly overparameterized neural network, which is not computationally efficient. This paper provides the first theoretical convergence and sample complexity analysis of the practical setting of DQNs with $\epsilon$-greedy policy. We prove an iterative procedure with decaying $\epsilon$ converges to the optimal Q-value function geometrically. Moreover, a higher level of $\epsilon$ values enlarges the region of convergence but slows down the convergence, while the opposite holds for a lower level of $\epsilon$ values. Experiments justify our established theoretical insights on DQNs.
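As a reminder of the exploration rule this abstract analyzes, the sketch below shows epsilon-greedy action selection together with a decaying epsilon. It is a minimal illustration only, operating on a plain list of Q-values rather than a Q-network; the exponential decay schedule, its constants, and the function names are assumptions and are not taken from the paper.

```python
import random


def epsilon_greedy(q_values, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore
    # Exploit: index of the largest Q-value (first one on ties).
    return max(range(len(q_values)), key=q_values.__getitem__)


def decayed_epsilon(step, start=1.0, end=0.05, decay=0.99):
    """Hypothetical exponential decay of epsilon, floored at `end`."""
    return max(end, start * decay ** step)


rng = random.Random(0)          # seeded for repeatability
q_values = [0.1, 0.9, 0.3]      # toy Q-values for three actions
for step in (0, 100, 1000):
    eps = decayed_epsilon(step)
    action = epsilon_greedy(q_values, eps, rng)
    print(step, round(eps, 3), action)
```

Early steps explore almost uniformly, while later steps mostly take the greedy action, which mirrors the trade-off between the region and the speed of convergence described above.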

arXiv.org e-Print Archive

MISMATCH: Fine-grained Evaluation of Machine-generated Text with Mismatch Error Types

Author: Abdelaziz Ibrahim
Chaudhury Subhajit
Crouse Maxwell
Dan Soham
Fokoue Achille
Gray Alexander
Gunasekara Chulaka
Kapanipathi Pavan
Mahajan Diwakar
Murugesan Keerthiram
Roukos Salim
Swaminathan Sarathkrishna
Publication date: 17/06/2023

With the growing interest in large language models, the need for evaluating the quality of machine text compared to reference (typically human-generated) text has become a focus of attention. Most recent works focus either on task-specific evaluation metrics or study the properties of machine-generated text captured by the existing metrics. In this work, we propose a new evaluation scheme to model human judgments in 7 NLP tasks, based on the fine-grained mismatches between a pair of texts. Inspired by the recent efforts in several NLP tasks for fine-grained evaluation, we introduce a set of 13 mismatch error types such as spatial/geographic errors, entity errors, etc., to guide the model for better prediction of human judgments. We propose a neural framework for evaluating machine texts that uses these mismatch error types as auxiliary tasks and re-purposes the existing single-number evaluation metrics as additional scalar features, in addition to textual features extracted from the machine and reference texts. Our experiments reveal key insights about the existing metrics via the mismatch errors. We show that the mismatch errors between the sentence pairs on the held-out datasets from 7 NLP tasks align well with the human evaluation. Comment: Accepted at ACL 2023 (ACL Findings Long)

arXiv.org e-Print Archive